Deploying Cloud Native Service Add-On Pack

NVIDIA Cloud Native Stack

This guide will walk through the deployment and setup steps of the NVIDIA Cloud Native Service Add-on Pack on an upstream Kubernetes deployment, such as the NVIDIA Cloud Native Stack. This is the simplest deployment configuration, where all of the provided components are installed directly on the cluster, with no integration with external services.

Note

The NVIDIA Cloud Native Stack deployment using upstream Kubernetes should be used for evaluation and development purposes only. It is not designed for production use.


  1. The following steps assume that the Requirements section has been met, and the NVIDIA Cloud Native Stack K8S cluster has already been set up from the steps in the previous section.

  2. Ensure that a kubeconfig is available and set via the KUBECONFIG environment variable or your user’s default location (.kube/config).
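    For example, assuming the kubeconfig was written to your user's default location, you can point kubectl at the cluster and confirm that it is reachable:

    # Use the default kubeconfig location (adjust the path if yours differs)
    export KUBECONFIG="$HOME/.kube/config"

    # The cluster should respond and all nodes should be Ready
    kubectl get nodes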

  3. Ensure that a FQDN and wildcard DNS entry are available and resolvable for the K8S cluster that was created.
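    As a quick sanity check (assuming the wildcard record is *.my-cluster.my-domain.com, as in the examples below), any hostname under the wildcard should resolve to the cluster's ingress address:

    # Both hostnames are used later in this guide and should resolve
    nslookup auth.my-cluster.my-domain.com
    nslookup dashboards.my-cluster.my-domain.com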

  4. Download the NVIDIA Cloud Native Service Add-on Pack from the Enterprise Catalog onto the instance you have provisioned, using the following command:


    ngc registry resource download-version "nvaie/nvidia_cnpack:0.2.1"

    Note

    If you still need to install and set up the NGC CLI with your API Key, please do so before downloading the resource. Instructions can be found here.
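    If the CLI is installed but not yet configured, a typical setup looks like the following (the ngc config set command prompts interactively for your API key):

    # Configure the NGC CLI with your API key, org, and defaults
    ngc config set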


  5. Navigate to the installer’s directory using the following command:


    cd nvidia_cnpack_v0.2.1
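    You can confirm that the installer binary used in the next steps is present:

    # The directory should contain the nvidia-cnpack-linux-x86_64 installer
    ls -l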


  6. Create a config file named config.yaml for the installation, using the following template as a minimal configuration. For full details on all the available configuration options, please reference the Advanced Usage section of the Appendix.

    Note

    Make sure to change the wildcardDomain field to match the DNS FQDN and wildcard record created as described in the Requirements section.


    apiVersion: v1alpha1
    kind: NvidiaPlatform
    spec:
      platform:
        wildcardDomain: "*.my-cluster.my-domain.com"
        externalPort: 443
      ingress:
        enabled: true
      postgres:
        enabled: true
      certManager:
        enabled: true
      trustManager:
        enabled: true
      keycloak:
        databaseStorage:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1G
          storageClassName: local-path
          volumeMode: Filesystem
      prometheus:
        storage:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1G
          storageClassName: local-path
          volumeMode: Filesystem
      grafana:
        enabled: true
      elastic:
        enabled: true

    Note

    If you installed the local-path-provisioner, the storageClassName fields can be left as shown (local-path).
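    If you are unsure which storage classes are available on your cluster, you can list them before filling in the storageClassName fields:

    # Any class listed here is a valid value for storageClassName
    kubectl get storageclass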


  7. Make the installer executable via the following command:


    chmod +x ./nvidia-cnpack-linux-x86_64


  8. Run the following command on the instance to set up NVIDIA Cloud Native Service Add-on Pack:


    ./nvidia-cnpack-linux-x86_64 create -f config.yaml
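    If you would like to follow progress while the installer runs, you can watch pods come up from a second terminal (optional):

    # Stream pod status changes across all namespaces
    kubectl get pods -A -w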


  9. Once the install is complete, check that all the pods are healthy via the following command:


    kubectl get pods -A


    The output should show all pods across all namespaces in a Running or Completed state.
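    To spot anything that is not yet healthy without scanning the full list, you can filter out healthy pods with a standard kubectl field selector (an empty result means all pods are up):

    # Lists only pods that are not in the Running or Succeeded phase
    kubectl get pods -A --field-selector=status.phase!=Running,status.phase!=Succeeded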

  10. As part of the installation, the installer creates the nvidia-platform and nvidia-monitoring namespaces, which contain most of the components and information required for interacting with the deployed services.

    • The default Keycloak instance URL is at: https://auth.my-cluster.my-domain.com

    • Default admin credentials can be found within the nvidia-platform namespace, in a secret called keycloak-initial-admin via the following commands:


    kubectl get secret keycloak-initial-admin -n nvidia-platform -o jsonpath='{.data.username}' | base64 -d
    kubectl get secret keycloak-initial-admin -n nvidia-platform -o jsonpath='{.data.password}' | base64 -d

    • The default Grafana instance URL is at: https://dashboards.my-cluster.my-domain.com

    • The default Grafana credentials can be found within the nvidia-monitoring namespace, in a secret called grafana-admin-credentials via the following commands:


    kubectl get secret grafana-admin-credentials -n nvidia-monitoring -o jsonpath='{.data.GF_SECURITY_ADMIN_USER}' | base64 -d
    kubectl get secret grafana-admin-credentials -n nvidia-monitoring -o jsonpath='{.data.GF_SECURITY_ADMIN_PASSWORD}' | base64 -d
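    Once DNS and ingress are in place, you can also confirm that both endpoints respond over HTTPS (a quick check; -k skips certificate verification in case the cluster is using a self-signed certificate):

    # Each request should print an HTTP status code rather than fail to connect
    curl -sk -o /dev/null -w '%{http_code}\n' https://auth.my-cluster.my-domain.com
    curl -sk -o /dev/null -w '%{http_code}\n' https://dashboards.my-cluster.my-domain.com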


  11. You can configure the components and services installed on the cluster as required for your use case. Specific examples can be found in the NVIDIA AI Workflow Guides.

© Copyright 2022-2023, NVIDIA. Last updated on May 23, 2023.